matrix-vector product
Gradients of Functions of Large Matrices
Tuning scientific and probabilistic machine learning models - for example, partial differential equations, Gaussian processes, or Bayesian neural networks - often relies on evaluating functions of matrices whose size grows with the data set or the number of parameters. While the state-of-the-art for evaluating these quantities is almost always based on Lanczos and Arnoldi iterations, the present work is the first to explain how to differentiate these workhorses of numerical linear algebra efficiently. To get there, we derive previously unknown ad-joint systems for Lanczos and Arnoldi iterations, implement them in JAX, and show that the resulting code can compete with Diffrax when it comes to differentiating PDEs, GPyTorch for selecting Gaussian process models and beats standard factorisation methods for calibrating Bayesian neural networks. All this is achieved without any problem-specific code optimisation.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (4 more...)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Faster Linear Algebra for Distance Matrices
The distance matrix of a dataset $X$ of $n$ points with respect to a distance function $f$ represents all pairwise distances between points in $X$ induced by $f$. Due to their wide applicability, distance matrices and related families of matrices have been the focus of many recent algorithmic works. We continue this line of research and take a broad view of algorithm design for distance matrices with the goal of designing fast algorithms, which are specifically tailored for distance matrices, for fundamental linear algebraic primitives. Our results include efficient algorithms for computing matrix-vector products for a wide class of distance matrices, such as the $\ell_1$ metric for which we get a linear runtime, as well as an $\Omega(n^2)$ lower bound for any algorithm which computes a matrix-vector product for the $\ell_{\infty}$ case, showing a separation between the $\ell_1$ and the $\ell_{\infty}$ metrics. Our upper bound results in conjunction with recent works on the matrix-vector query model have many further downstream applications, including the fastest algorithm for computing a relative error low-rank approximation for the distance matrix induced by $\ell_1$ and $\ell_2^2$ functions and the fastest algorithm for computing an additive error low-rank approximation for the $\ell_2$ metric, in addition to applications for fast matrix multiplication among others. We also give algorithms for constructing distance matrices and show that one can construct an approximate $\ell_2$ distance matrix in time faster than the bound implied by the Johnson-Lindenstrauss lemma.
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Accelerating Natural Gradient Descent for PINNs with Randomized Nyström Preconditioning
Bioli, Ivan, Marcati, Carlo, Sangalli, Giancarlo
Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical use is often limited by the high computational cost of solving linear systems involving the Gramian matrix. While matrix-free NGD methods based on the conjugate gradient (CG) method avoid explicit matrix inversion, the ill-conditioning of the Gramian significantly slows the convergence of CG. In this work, we extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Nyström preconditioning to accelerate convergence of the inner CG solver. The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods and other state-of-the-art optimizers on a range of PDE problems discretized using neural networks.
- Europe > Italy (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
- (2 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (4 more...)
- Information Technology > Mathematics of Computing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)